Spectral centrality measures in complex networks 



Nicola Perra^{1,2} and Santo Fortunato^{3}

^1 Dipartimento di Fisica, Università di Cagliari, Italy
^2 Linkalab, Center for the Study of Complex Networks, Cagliari 09129, Sardegna - Italy
^3 Complex Networks Lagrange Laboratory (CNLL),
Institute for Scientific Interchange (ISI), Viale S. Severo 65, 10133, Torino, Italy

(Dated: September 23, 2008) 

Complex networks are characterized by heterogeneous distributions of the degree of nodes, which 
produce a large diversification of the roles of the nodes within the network. Several centrality 
measures have been introduced to rank nodes based on their topological importance within a graph. 
Here we review and compare centrality measures based on spectral properties of graph matrices. We 
shall focus on PageRank, eigenvector centrality and the hub/authority scores of HITS. We derive 
simple relations between the measures and the (in)degree of the nodes, in some limits. We also 
compare the rankings obtained with different centrality measures. 

PACS numbers: 89.75.-k, 89.75.Hc

Keywords: Networks, centrality, matrix spectra



I. INTRODUCTION 

Complex systems can be represented as networks, 
where the main units of the system become nodes and 
interacting units are connected by edges. The last years 
have witnessed an intense research activity on networks 
by the scientific community, after the discovery that 
many systems in nature, society and technology turn
into graphs with peculiar properties. In particular,
many networks are characterized by a heterogeneous
distribution of the number of neighbors of a node, or
degree, where nodes with low degree coexist with nodes
with large degree (hubs). Such heterogeneity is responsible
for a number of remarkable features of real networks,
such as resilience to random failures/attacks
and the absence of a threshold for percolation and
epidemic spreading. The presence of nodes with different
degrees means that there is a broad diversification
of their roles within the graph. Centrality measures are
designed to rank graph nodes based on their topological
importance. Among the most popular centrality
measures we mention degree itself, but also measures
depending on shortest paths between nodes' pairs, like 
node betweenness and closeness. There are as well cen- 
trality measures that depend on spectral properties of 
graph matrices. These measures are important because 
they are usually associated to simple dynamic processes 
taking place on graphs, like diffusion. In particular, the 
PageRank algorithm, proposed by the Google founders 
Brin and Page [1], managed to turn Google into the leading
interface between users and the World Wide Web. In
recent work spectral properties of graph matrices have
also been used to characterize the participation of nodes
in network subgraphs (subgraph centrality) [10] and
to estimate the bipartitivity of graphs [11].

However, spectral centrality measures have not been
much investigated in the physics literature. We shall
introduce and review four centrality measures: PageRank,
eigenvector centrality and the hub/authority scores
introduced by Kleinberg for his HITS algorithm.
These measures are usually adopted on directed graphs;
we shall also discuss extensions to the undirected case,
where applicable.

In Section II we present the measures and describe
them in some detail. Analytical and numerical results on
particular graphs will be shown in Section III, whereas
in Section IV we shall compare the rankings of nodes of
real graphs for different centrality measures. Conclusions
will be reported in Section V.



II. CENTRALITY MEASURES 

The basic matrix of a graph is the adjacency matrix 
A, where the element A_ij equals 1 if nodes i and j are
connected by a link, and 0 if they are not. If the network
is directed, the adjacency matrix is not symmetric. In 
this case, it is necessary to distinguish between two types 
of links adjacent to a node, i.e. links pointing to the
node (incoming) and links pointing outside (outgoing). 
Therefore, there are two types of degree: indegree, i.e. 
the number of incoming links; outdegree, i.e. the number 
of outgoing links. Likewise, one distinguishes between 
the in-neighbors of a node, i.e. the nodes pointing at the 
node, and the out-neighbors, i.e. the nodes pointed at 
by the node. The directedness of the links has a num- 
ber of important implications, involving both some basic 
structural concepts, like connectivity, and processes tak- 
ing place on the network. For instance, a random walk 
is a stationary process on any undirected graph, but it 
is not in general on a directed graph, due to the possible 
presence of dangling ends, i.e. nodes with zero outdegree, 
that act as sinks for the process. On the other hand, dif- 
fusion leads to a natural definition of centrality, based on 
the frequency that a walker stops by a node during the 
process. In order to make a diffusive process stationary 
on a directed graph, one needs to give the walker the op- 
portunity to leave from a dangling end. PageRank offers 
a simple solution, which we describe below.

A. PageRank 

PageRank (PR) is the prestige measure used by Google
to rank Web pages. It is supposed to simulate the behavior
of a user browsing the Web. Most of the time, the
user visits pages just by surfing, i.e. by clicking on hyperlinks
of the page he is on; otherwise, the user will jump
to another page by typing its URL in the browser, going
to a bookmark, etc. On a graph, this process can be
modelled by a simple combination of a random walk with
occasional jumps towards randomly selected nodes. This
can be described by the simple set of implicit relations

$$ p(i) = \frac{q}{n} + (1-q) \sum_{j:\, j \to i} \frac{p(j)}{k_{out}(j)}, \qquad i = 1, 2, \dots, n. \qquad (1) $$

Here, n is the number of nodes of the graph, p(i) is the
PR value of node i, k_out(j) the outdegree of node j, and
the sum runs over the nodes j pointing towards i. The
damping factor q is a probability that weighs the mixture
between random walk and random jump. In practical
applications it is usually set to small values (typically
0.15). For any q > 0 the process reaches stationarity, as
a walker has a finite (no matter how small) probability
to escape from a dangling end, whenever it lands there.
When q = 0, the process may not be stationary and PR
is ill defined. When q = 1, instead, the jumping process
dominates and all nodes have the same PR value 1/n.
PR goes beyond indegree: in order to have a large PR- 
value for a node it is important to have many neighbors 
pointing at a node, i.e. large indegree, but it is also 
important that the neighbors have large PR-values. So, 
if two nodes have equal indegree, the node with more 
"important" neighbors will have larger PR. 

Solving the set of equations (1) is equivalent to solving
the eigenvalue problem for the transition matrix M,
whose element M_ij is given by the following expression:

$$ M_{ij} = \frac{q}{n} + (1-q) \frac{A_{ji}}{k_{out}(j)}. \qquad (2) $$

PR is just the principal eigenvector of M, and is usually
determined with the power method, i.e. by repeatedly
multiplying the matrix M by an arbitrary vector until all
the entries of the resulting vector are stable. This is also
the procedure we adopted to compute the eigenvectors
corresponding to all centrality measures we studied.
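As a minimal sketch of the power method just described, the following Python function iterates Eq. (1) until the entries stop changing. The function name `pagerank`, the adjacency-list input `out_links` (where `out_links[j]` lists the nodes that j points to) and the stopping tolerance are illustrative assumptions, not the authors' implementation. The example graph has no dangling ends, so the PR values add up to 1.

```python
def pagerank(out_links, q=0.15, tol=1e-12, max_iter=1000):
    """Iterate p(i) = q/n + (1-q) * sum_{j->i} p(j)/k_out(j), i.e. Eq. (1)."""
    n = len(out_links)
    p = [1.0 / n] * n                      # arbitrary starting vector
    for _ in range(max_iter):
        new = [q / n] * n                  # random-jump contribution
        for j, targets in enumerate(out_links):
            if targets:                    # dangling ends pass nothing on
                share = (1.0 - q) * p[j] / len(targets)
                for i in targets:
                    new[i] += share        # random-walk contribution
        diff = sum(abs(a - b) for a, b in zip(new, p))
        p = new
        if diff < tol:                     # entries are stable
            break
    return p


# Hypothetical example: nodes 0 and 1 both point to 2, and 2 points to 0.
example = pagerank([[2], [2], [0]], q=0.15)
```

Node 1 has no incoming links, so its PR value is just q/n, while node 2 collects the PR of both other nodes and ranks first.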

The literature on PR is very large, because of its huge
impact on Web search. In one of the first theoretical
studies [13], the dependence of PR on the damping factor
was investigated. In general, attention has been
mostly focused on the graph of the World Wide Web,
where Web pages are nodes and hyperlinks their connections.
Comparatively little has been done to study
the measure on more general classes of networks. A recent
mean field study has shown that the average PR
value of nodes with the same indegree is a linear function
of indegree in the absence of degree-degree correlations.
In another study, some analytical results were found on
PR distributions on special classes of graphs. In Section
III A we shall briefly summarize the results of [16] and
build on them.



B. Eigenvector centrality 

The eigenvector centrality (EV) is also based on the 
principle that the importance of a node depends on the 
importance of its neighbors. In this case the relationship 
is more straightforward than for PR: the prestige x_i of
node i is just proportional to the sum of the prestiges of
the neighboring nodes pointing to it:

$$ \lambda x_i = \sum_{j:\, j \to i} x_j = \sum_{j} A_{ji} x_j = (A^t x)_i. \qquad (3) $$

From Eq. (3) we see that x_i is just the i-th component of
the eigenvector of the transpose of the adjacency matrix
with eigenvalue λ. We notice that the trivial eigenvector
with all components equal to zero is always a solution
of Eq. (3). The true EV is then associated to the existence
of non-trivial solutions of the eigenvalue problem
of Eq. (3). From Eq. (3) we see that nodes with indegree
zero also have zero centrality: in general, nodes pointed
at only by nodes with zero centrality also have zero centrality
and this effect will propagate to other nodes, so that in
many cases EV would not give any information about a
large number of nodes. To avoid this, it is useful to make
the following modification: to each node we assign a prestige
e, which is independent of its relationships with the
other nodes. Eq. (3) is then modified as follows:

$$ x_i = \alpha (A^t x)_i + e. \qquad (4) $$

The role of the parameter e is reminiscent of that of the damping
factor q in PR. The parameter α weighs the relative
importance of the contribution of the peers versus that
of the node itself. The new measure is called α-centrality
(αEV) and is the one we shall investigate in this
paper. We remark that, in contrast to PR, here the solutions
do not have a natural interpretation in terms of
probability, so the sum of the α-centralities need not be
1. However, we shall normalize the final values by dividing
them by their sum, so as to make them add up to 1, for
practical purposes.
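The modified relation (4) can be solved by simple iteration, as in the following sketch. The function name `alpha_centrality` and the input format (`in_links[i]` lists the nodes pointing to i) are assumptions made for illustration. The iteration is only guaranteed to converge when α is smaller than the inverse of the largest eigenvalue of the adjacency matrix; on a directed acyclic graph, such as the chain used here, it converges for any α.

```python
def alpha_centrality(in_links, alpha, e=1.0, tol=1e-12, max_iter=10000):
    """Iterate x_i = alpha * sum_{j->i} x_j + e (Eq. (4)), then normalize."""
    x = [e] * len(in_links)
    for _ in range(max_iter):
        new = [alpha * sum(x[j] for j in sources) + e for sources in in_links]
        diff = max(abs(a - b) for a, b in zip(new, x))
        x = new
        if diff < tol:
            break
    total = sum(x)
    return [v / total for v in x]          # values add up to 1


# Hypothetical example: the directed chain 0 -> 1 -> 2.
chain = alpha_centrality([[], [0], [1]], alpha=0.5, e=1.0)
```

Before normalization the chain gives x = (1, 1.5, 1.75): node 0 has indegree zero but still receives the constant prestige e, which is exactly what the modification is meant to achieve.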



C. HITS scores 

Google's PR was not the first prestige measure for Web 
pages based on the Web's graph representation. Shortly 
before the seminal paper by Brin and Page, Jon Klein- 
berg had proposed another solution to the problem 
of ranking Web sites based on their importance for the 



users. This solution was the HITS algorithm, which dis- 
tinguishes two types of Web pages: hubs and authorities. 
Let us suppose that a user submits a query through a 
search engine. If a page is very relevant for this query, 
one can reasonably expect that it will be pointed at by 
many other pages. However, the simple indegree would
not allow one to discriminate the relevant pages from other
pages with similar (large) indegree. An important difference
is that pages pointing to a relevant page are likely to
point as well to other relevant pages, so as to create a sort
of bipartite structure where relevant pages (authorities)
are cited by special pages/indices (hubs). Such bipartite
structures make it possible to identify the relevant pages for the user
query. Therefore one assigns two scores to a page i of the
Web: the hub score x_i and the authority score y_i. Pages
with high authority scores are pointed at by pages with
high hub scores. In turn, a good hub points at (very) authoritative
pages. This mutually reinforcing mechanism
is described by the coupled relations



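$$ \mu x_i = \sum_{j:\, i \to j} y_j = \sum_{j} A_{ij} y_j = (A y)_i, \qquad (5) $$

$$ \nu y_i = \sum_{j:\, j \to i} x_j = \sum_{j} A_{ji} x_j = (A^t x)_i, \qquad (6) $$

which can be rewritten in the form of simple eigenvalue
equations for both x and y by substitution:

$$ \lambda x_i = (A A^t x)_i, \qquad (7) $$

$$ \lambda y_i = (A^t A y)_i, \qquad (8) $$

where λ = μν.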



From Eqs. (7) and (8) we see that the hub and authority
scores are just eigenvectors of the matrices AA^t and
A^tA, respectively. We stress that both AA^t and A^tA are symmetric,
whether A is symmetric or not. The scores x and y correspond
to the principal eigenvectors of these matrices.
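The mutually reinforcing hub/authority updates can be sketched as a power iteration in which authority scores are refreshed from hub scores and vice versa. The function name `hits`, the adjacency-list input `out_links` and the fixed number of iterations are illustrative assumptions; the sketch also assumes that the graph has at least one link, so that the normalization never divides by zero.

```python
import math


def hits(out_links, n_iter=200):
    """Return (hub, authority) scores, normalized to unit Euclidean length."""
    n = len(out_links)
    hub = [1.0] * n
    auth = [1.0] * n
    for _ in range(n_iter):
        # Authority of i: sum of the hub scores of its in-neighbors, (A^t x)_i.
        auth = [0.0] * n
        for j, targets in enumerate(out_links):
            for i in targets:
                auth[i] += hub[j]
        # Hub of i: sum of the authority scores of its out-neighbors, (A y)_i.
        hub = [sum(auth[i] for i in targets) for targets in out_links]
        norm_a = math.sqrt(sum(v * v for v in auth))
        norm_h = math.sqrt(sum(v * v for v in hub))
        auth = [v / norm_a for v in auth]
        hub = [v / norm_h for v in hub]
    return hub, auth


# Hypothetical example: node 0 points to 2 and 3, node 1 points to 2 only.
hub, auth = hits([[2, 3], [2], [], []])
```

In this small example node 2 is the best authority (it is cited by both hubs) and node 0 the best hub (it cites both authorities).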



III. RESULTS 

A. PageRank

In [16] the two main limits of the PR measure, corresponding
to q → 0 and q → 1, were investigated. Analytical
results can be derived for special graphs, such
as graphs grown with popular mechanisms, like preferential
attachment [17]. For our proofs we shall focus
on the model by Dorogovtsev, Mendes and Samukhin
(DMS) [18], which generates graphs with power-law degree
distributions with any exponent larger than 2. In
this model, at each time step a new node is added to the
graph and m links are set from the new node to preexisting
ones. The probability that a new node i gets attached
to a node j (with indegree k_j) is

$$ \Pi(k_j, a) = \frac{a + k_j}{\sum_{l} (a + k_l)}, \qquad (9) $$

where the sum runs over the preexisting nodes and a is a
positive constant. When a = m one recovers
the recipe of the original preferential attachment
formulation of Barabási and Albert. In general, the
exponent of the indegree distribution is γ = 2 + a/m. For
simplicity, we shall study the special case in which m = 1,
i.e. each node has outdegree 1 and the network is a tree.
The results are however independent of m.

FIG. 1: Subgraph of a tree. The PR values of all nodes shown
can be simply calculated.
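The growth rule for the special case m = 1 can be sketched as follows. The function name `dms_tree`, the seeded random generator and the returned `parent`/`indeg` lists are assumptions for illustration; each new node picks its target with probability proportional to a + k_j, where k_j is the current indegree of node j. The weight list is rebuilt at every step, so this sketch is only meant for small graphs.

```python
import random


def dms_tree(n, a, seed=0):
    """Grow a tree with n nodes; parent[i] is the node that i points to."""
    rng = random.Random(seed)
    parent = [None]                 # node 0 is the root and sets no link
    indeg = [0]
    for new in range(1, n):
        weights = [a + k for k in indeg]            # proportional to a + k_j
        target = rng.choices(range(new), weights=weights)[0]
        parent.append(target)
        indeg[target] += 1
        indeg.append(0)             # the new node has no incoming links yet
    return parent, indeg


parent, indeg = dms_tree(200, a=1.0)
```

By construction the graph has n nodes and n - 1 links, every node except the root has outdegree 1, and links always point from newer to older nodes.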



1. The limit q → 0

We assume that q is very small. To the first order in
q, and remembering that each node has outdegree 1 by
construction, Eq. (1) takes the following form:

$$ p(i) = \frac{q}{n} + \sum_{j:\, j \to i} p(j), \qquad i = 1, 2, \dots, n, \qquad (10) $$



which looks particularly simple, though not generally
solvable. From Eq. (10) we see that the PR of a node
equals a constant plus the PR of its in-neighbors. This
recipe makes it possible to calculate PR recursively on simple trees,
as shown in Fig. 1, where we focus on a subgraph of a
tree. Node A is the root of the subgraph, as every walk
starting on any of the nodes will reach A at some stage.
We call any node with this property a predecessor of A.
The PR value of any node of the graph is determined only
by its predecessors. In the case illustrated, the calculation
is particularly simple: we start from the leaves of the
subgraph (empty circles), whose PR is just q/n because
they have no incoming links, and move towards A. For
each node, we apply the relation (10). The final values
are reported next to the nodes. From this example we
can deduce a number of general properties:

• all PR values are multiples of the elementary unit
q/n;

• PR increases if one moves from a node to another 
by following a link; 

• the PR of each node i, in units of q/n, equals the 
number of its predecessors. 

Since PR takes only discrete values, in the following we
shall measure it in units of q/n. We thus indicate the
distribution with P_PR(l), with l = 1, 2, ..., n.
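The recursive recipe above (start from the leaves and move towards the root) can be sketched in a few lines. The function name `predecessor_counts` and the `parent` list format are assumptions for illustration; the sketch also assumes that every node has a larger index than the node it points to, as happens when nodes are labelled in order of arrival during growth. The returned value for node i is its PR in units of q/n, i.e. the number of its predecessors, counting i itself.

```python
def predecessor_counts(parent):
    """PR in units of q/n for a tree with outdegree 1 (small-q limit).

    parent[i] is the node that i points to (None for the root, node 0).
    """
    n = len(parent)
    count = [1] * n                     # every node counts itself (the q/n term)
    for i in range(n - 1, 0, -1):       # visit newer nodes (leaves) first
        count[parent[i]] += count[i]    # pass the accumulated value downstream
    return count


# Hypothetical tree: 1 and 2 point to 0, 3 and 4 point to 1, 5 points to 2.
counts = predecessor_counts([None, 0, 0, 1, 1, 2])   # [6, 3, 2, 1, 1, 1]
```

The leaves 3, 4 and 5 get the elementary value 1, and the value grows along every link, in agreement with the properties listed above.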

In a dynamic process like network growth, it is crucial
to see what happens to the PR values and their distribution when
a new node comes into the picture. This is shown in
Fig. 2, where a new node N is added to the network of
Fig. 1. We see that only the nodes encountered along the
path from N to A, including A, are affected, while the
others retain their PR values. In particular, the presence
of the node N determines an increase by q/n in the PR
values of the affected nodes.

Now we are ready to build a master equation for the
PR distribution P_PR(l) on a DMS graph. At time n,
the graph has n nodes and n - 1 links (the root does
not generate links); the PR distribution is P^n_PR(l). If
we add node n + 1 we get a new distribution P^{n+1}_PR(l).
As we have seen above, the new node will contribute an
additional q/n to the PR of the nodes in the path from
n + 1 to the root of the graph. We need to compute the
balance between the nodes passing from PR l - 1 to l and
those passing from l to l + 1. The probability Π^n that
the PR of node i, initially equal to l, will be changed by
the new node equals the probability that the link set by
the new node gets attached to one of the predecessors of
i (including i) and equals

$$ \Pi^n = \sum_{j \Rightarrow i} \frac{a + k_j}{(a+1)n - 1}, \qquad (11) $$

where j ⇒ i means that j is a predecessor of i. None
of the predecessors of i, other than i, can reach PR l + 1
because of the new node, as their initial values are
necessarily smaller than l. The number of predecessors
of i (including i) is l and the total number of adjacent
links to the predecessors is l - 1 (one for each predecessor,
except i). So,

$$ \Pi^n = \sum_{j \Rightarrow i} \frac{a + k_j}{(a+1)n - 1} = \frac{(a+1)l - 1}{(a+1)n - 1}. \qquad (12) $$
FIG. 2: If a new node N gets attached to any node of the
subgraph, it adds an equal contribution q/n to the PR of all
nodes in a path from N to the root.



The number of nodes with PR l that are affected by the
presence of the new node and its link is then

$$ N^n(l) = n P^n_{PR}(l)\, \Pi^n = \frac{(a+1)l - 1}{(a+1) - 1/n}\, P^n_{PR}(l), \qquad (13) $$

and the master equation reads

$$ (n+1) P^{n+1}_{PR}(l) - n P^n_{PR}(l) = N^n(l-1) - N^n(l). \qquad (14) $$

Eq. (14) holds for l > 1. For l = 1 a modification is
necessary, as there cannot be nodes with zero PR, so the
term N^n(0) is not defined. However, since the new node
has no incoming links, the number of nodes with PR 1
increases by 1 because of the new node, so we can write

$$ (n+1) P^{n+1}_{PR}(1) - n P^n_{PR}(1) = 1 - N^n(1). \qquad (15) $$



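The stationarity condition of Eqs. (14) and (15), in the
limit of large n, leads to the relations

$$ P_{PR}(l) = \begin{cases} \dfrac{(a+1)l - a - 2}{(a+1)l + a}\, P_{PR}(l-1), & \text{if } l > 1; \\[2mm] \dfrac{a+1}{2a+1}, & \text{if } l = 1, \end{cases} \qquad (16) $$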
FIG. 3: PR distribution for small q on a DMS graph with 10'^
nodes, m = 1 and a = 1. In this case the indegree distribution
is a power law with exponent γ = 3.

FIG. 4: Reduced PR distribution for q ≈ 1 on a DMS graph
with 10^ nodes, m = 1 and a = 1. The curve matches the
indegree distribution.



which has the solution

$$ P_{PR}(l) = \frac{a(a+1)}{[(a+1)l + a][(a+1)l - 1]}, \qquad \text{for } l \geq 1. \qquad (17) $$

We see that the PR distribution in the limit q → 0 on a
DMS tree is a power law with exponent 2, for any value
of the parameter a, including the limit case a → ∞, when
the indegree distribution becomes exponential. This result
is confirmed by numerical simulations (Fig. 3), which
also show that the hypothesis of the tree is not necessary,
as long as each node has the same outdegree m.
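The closed form of Eq. (17) can be checked with exact rational arithmetic, as in the sketch below. The function name `p_pr` is an assumption for illustration. Because (a+1)(l+1) - 1 = (a+1)l + a, the terms of Eq. (17) telescope, so the partial sum up to L equals 1 - a/[(a+1)L + a] and the distribution is normalized; for large l the product l^2 P_PR(l) tends to a/(a+1), which is the power law with exponent 2.

```python
from fractions import Fraction


def p_pr(l, a):
    """Stationary PR distribution of Eq. (17), in exact arithmetic."""
    a = Fraction(a)
    return a * (a + 1) / (((a + 1) * l + a) * ((a + 1) * l - 1))


a = 1
first = p_pr(1, a)                               # (a+1)/(2a+1), second line of Eq. (16)
partial = sum(p_pr(l, a) for l in range(1, 51))  # 1 - a/((a+1)*50 + a)
tail = 1000 ** 2 * p_pr(1000, a)                 # close to a/(a+1)
```

The same check can be repeated for any positive a, which illustrates that the exponent 2 does not depend on the attachment parameter.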

In [16] the same result was found for other models of
network growth, like Barabási-Albert preferential attachment
[17] and the Copying Model [19]. It is possible that
this property holds for general graphs where the flows
converge towards a central root (sink). Indeed, our finding
agrees with the more general result on the size distribution
of supercritical trees [20]. Moreover, numerical
studies have shown that the same behavior holds for the
graph of the Internet, when one considers the distribution of
the size of the basin connected to a given point [21]. Indeed,
our calculation follows the same procedure usually
adopted for the calculation of the area of basins in river
networks.



2. The limit q → 1

The case q = 1 is well defined, but trivial, as all nodes
end up having the same PR value 1/n. We ask how this
limit is reached. If q ≈ 1, the contribution to PR given
by the in-neighbors of a node is very small compared to
the constant term, which is close to 1/n. In order to
study the behavior of this contribution, we define the reduced
PageRank p_r(i) of a node i as

$$ p_r(i) = p(i) - \frac{q}{n}, \qquad i = 1, 2, \dots, n. \qquad (18) $$

We assume that all nodes have the same outdegree m.
In this case, to leading order in the infinitesimal 1 - q,
Eq. (1) can be rewritten as

$$ p_r(i) = \frac{1-q}{nm}\, k_{in}(i), \qquad i = 1, 2, \dots, n, \qquad (19) $$



where k_in(i) is the indegree of i. We conclude that on
any graph, the reduced PR of a node in the limit q → 1
is proportional to the indegree of the node, if all nodes
have the same outdegree. This result has been derived
independently in [23]. As a consequence of Eq. (19), the
distribution of the reduced PR for q → 1 has the same
trend as that of indegree, which can be easily verified
numerically (Fig. 4).
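The proportionality between reduced PR and indegree for q close to 1 can be checked on a small graph with constant outdegree. The sketch below is an illustration under stated assumptions: the function names, the fixed number of iterations and the five-node example graph (outdegree m = 2 for every node) are not taken from the paper. The reduced PR is rescaled by nm/(1 - q), so that Eq. (19) predicts values close to the indegrees.

```python
def pagerank(out_links, q, n_iter=200):
    """Power iteration for Eq. (1); assumes no dangling ends."""
    n = len(out_links)
    p = [1.0 / n] * n
    for _ in range(n_iter):
        new = [q / n] * n
        for j, targets in enumerate(out_links):
            for i in targets:
                new[i] += (1.0 - q) * p[j] / len(targets)
        p = new
    return p


def rescaled_reduced_pagerank(out_links, q, m):
    """Reduced PR of Eq. (18), multiplied by n*m/(1-q)."""
    n = len(out_links)
    return [(v - q / n) * n * m / (1.0 - q) for v in pagerank(out_links, q)]


# Hypothetical graph with outdegree 2 everywhere; indegrees are 4, 3, 2, 1, 0.
graph = [[1, 2], [0, 2], [0, 1], [0, 1], [0, 3]]
estimate = rescaled_reduced_pagerank(graph, q=0.999, m=2)
```

The deviations from the indegrees are of relative order 1 - q, as expected for a leading-order result.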



3. Extension to undirected graphs 

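PR can be easily extended to undirected graphs as well.
The corresponding equation reads

$$ p(i) = \frac{q}{n} + (1-q) \sum_{j:\, j \leftrightarrow i} \frac{p(j)}{k_j}, \qquad i = 1, 2, \dots, n, \qquad (20) $$

where the sum runs over the neighbors j of i and k_j is now
the degree of node j. For the purposes
of a random walk, undirected links can be crossed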
in both directions, so a pure random walk now always 
reaches stationarity due to the absence of dangling ends. 
In fact, the stationary probability of a random walk on 
a node of any undirected graph is simply proportional 
to the degree of the node [23]. However, in Eq. (20)
we still have the contribution of random jumping, and it
turns out that the mixed process is still hard to solve.
We are not aware of a general solution in this case. In
the limit q → 0 PR is now well behaved, and its distribution
coincides with the degree distribution of the graph.
In Fig. 5 we show the distributions of reduced PR for
different values of q on a DMS graph with a power law
degree distribution and exponent γ = 3. The reduced
PR expresses the contribution to PR given by the random
walk. We see that the curves follow the decay of the
degree distribution for any value of q. We have computed
the reduced PR distribution on many other graphs and in
all cases we found that they follow the same trend as the
degree distribution. For example, in Fig. 6 we show the
comparison between reduced PR and degree for a sample
of the Web link graph. Here the nodes are Web pages of
the domain .gov and two pages are connected if there
is a hyperlink from one to the other. There are 794,184
nodes and 6,460,903 links. The graph is directed but
PR was calculated by neglecting the directedness of the
links. As we can see, the decay of the distributions of reduced
PR resembles that of the degree distribution. The
graph at hand is not simple like the DMS networks, as it
presents a large number of loops and community structure.
Therefore the result is likely to be general. We can
show this with a simple argument. The general equation
for reduced PR on undirected graphs is

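$$ p_r(i) = \frac{q(1-q)}{n} \sum_{j:\, j \leftrightarrow i} \frac{1}{k_j} + (1-q) \sum_{j:\, j \leftrightarrow i} \frac{p_r(j)}{k_j}, \qquad (21) $$

where the sums run over the neighbors j of node i. Eq. (21)
can be solved formally by successive iteration, yielding
the general form

$$ p_r(i) = \frac{q}{n} \sum_{s=1}^{\infty} (1-q)^s \sum_{i_1} \frac{1}{k_{i_1}} \sum_{i_2} \frac{1}{k_{i_2}} \cdots \sum_{i_s} \frac{1}{k_{i_s}}, \qquad (22) $$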
where i_s indicates the neighbors of the s-shell of the node
i; so, i_1 indicates the nearest neighbors of i, i_2 the
next-to-nearest neighbors, and so on. The last (innermost) sum in
Eq. (22) is, for a given node i_{s-1}, a sum over
its neighbors i_s. This sum, that we call T_{i_{s-1}}, contains
k_{i_{s-1}} terms, k_{i_{s-1}} being the degree of node i_{s-1}. The sum T_{i_{s-1}}
can be approximated as the product k_{i_{s-1}} ⟨1/k⟩_NN, where
⟨1/k⟩_NN is the expected value of the average of 1/k over
the neighbors of a node of the network. In general, T_{i_{s-1}} =
k_{i_{s-1}} ⟨1/k⟩_NN + η_{i_{s-1}}, where η_{i_{s-1}} is a random variable with
mean zero. In this way, it is easy to see from Eq. (22)
that, for any value of s, the product of sums reduces
to k_i ⟨1/k⟩_NN plus the sum of many random variables
like η_{i_{s-1}}. Due to the Central Limit Theorem, the latter
sum, if it includes a large number of terms, yields a very
small value with large probability. We can then conclude
that, for k_i sufficiently large, each term of the series in
Eq. (22) is proportional to k_i with good approximation,
therefore p_r(i) is also proportional to k_i, for any value
of the damping factor q. We have verified numerically
that this assertion is true for many graphs and degree
distributions, without finding exceptions.
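As a small illustration of the undirected case, the sketch below applies Eq. (20) to a graph given as neighbor lists. The function names and the four-node example graph are assumptions for illustration. For q = 0 the vector proportional to the degrees is left unchanged by one application of Eq. (20), which is the pure random walk result quoted above; for q > 0 the iteration converges and the values still add up to 1, because an undirected graph has no dangling ends.

```python
def undirected_pagerank_step(neighbors, p, q):
    """One application of Eq. (20) to the vector p."""
    n = len(neighbors)
    return [q / n + (1.0 - q) * sum(p[j] / len(neighbors[j]) for j in neighbors[i])
            for i in range(n)]


def undirected_pagerank(neighbors, q, n_iter=500):
    """Repeat Eq. (20) a fixed number of times from the uniform vector."""
    p = [1.0 / len(neighbors)] * len(neighbors)
    for _ in range(n_iter):
        p = undirected_pagerank_step(neighbors, p, q)
    return p


# Hypothetical graph: triangle 0-1-2 with an extra node 3 attached to node 0.
neighbors = [[1, 2, 3], [0, 2], [0, 1], [0]]
total_degree = sum(len(nb) for nb in neighbors)
by_degree = [len(nb) / total_degree for nb in neighbors]
unchanged = undirected_pagerank_step(neighbors, by_degree, q=0.0)
mixed = undirected_pagerank(neighbors, q=0.15)
```

On a graph this small the large-degree argument based on the Central Limit Theorem cannot be tested; the sketch only shows how the quantities entering that argument are computed.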



FIG. 6: Reduced PR on undirected graphs. Variability of
reduced PR distribution with q on the domain .gov of the
World Wide Web. The degree distribution has a tail which 
follows fairly well a power law with exponent 2.1. To better 
show the agreement we have shifted the curves such that the 
tails overlap. 



B. Eigenvector centrality 

1. Directed graphs 

The defining Eq. (4) is formally analogous to Eq. (10).
The only difference is that the eigenvalue α is not 1 as
for PR. However, the results of Section III A 1 hold as
well when the outdegree m is greater than 1 (as long as
it is the same for all nodes), and in this case the sum
of Eq. (10) would include a multiplicative factor 1/m,
which makes it identical to Eq. (4). We then deduce
that all results found for PR in the limit q → 0 hold for
αEV. Here the results are more general, because we did
not need to make any approximation to get to Eq. (4),
as we instead needed to derive Eq. (10). In particular,
it is not necessary that e be very small and the nodes
need not have the same outdegree, although this is the
case for the graphs we considered. We conclude that the
distribution of αEV on DMS graphs has a power law tail
with exponent 2 (Fig. 7). The same holds for graphs built
using preferential attachment and the Copying Model,
just as it happens for PR in the limit q → 0.

FIG. 7: Distribution of αEV on a directed DMS graph with
10^ nodes, m = 1 and a = 1. The dashed line indicates the
predicted slope.



2. Extension to undirected graphs 

On undirected graphs, Eq. (4) becomes

$$ x_i = \alpha (A x)_i + e, \qquad (23) $$

since A^t = A. So, the αEV of a node is proportional to
the sum of the αEV of its neighbors, modulo an additive
constant e. As we have done for PR, we define the reduced
α-centrality as

$$ x^r_i = x_i - e. \qquad (24) $$

So, we can rewrite Eq. (23) as

$$ x^r_i = \alpha (A x^r)_i + k_i \alpha e, \qquad (25) $$

where k_i is again the degree of node i. We can apply a
similar argument as in Section III A 3. The sum over the
k_i neighbors of i can be approximated as k_i ⟨x^r⟩, where
⟨x^r⟩ is the average of the reduced αEV over the whole
graph. The approximation becomes more accurate as
the number k_i of summands grows. In this way, from Eq. (25)
we see that the reduced αEV of a node is proportional
to its degree, if the latter is large enough. This result
is independent of the specific graph we consider, and we
have verified it numerically for many types of networks.
In Fig. 8 we show the distribution of reduced αEV for
different choices of the parameter e/α for the sample of
the Web graph we analyzed in Fig. 6. The curves closely
follow the decay of the degree distribution.
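The step from the undirected α-centrality to its reduced form in Eq. (25) can be verified numerically on a small graph. The sketch below is illustrative only: the function name, the example graph and the values α = 0.2, e = 1 are assumptions, and α is chosen smaller than the inverse of the largest adjacency eigenvalue of this graph (about 2.2) so that the iteration converges.

```python
def alpha_centrality_undirected(neighbors, alpha, e=1.0, n_iter=2000):
    """Iterate x_i = alpha * sum_{j in nn(i)} x_j + e (unnormalized)."""
    x = [e] * len(neighbors)
    for _ in range(n_iter):
        x = [alpha * sum(x[j] for j in nb) + e for nb in neighbors]
    return x


# Hypothetical graph: triangle 0-1-2 with an extra node 3 attached to node 0.
neighbors = [[1, 2, 3], [0, 2], [0, 1], [0]]
alpha, e = 0.2, 1.0
x = alpha_centrality_undirected(neighbors, alpha, e)
reduced = [v - e for v in x]                      # Eq. (24)

# Right-hand side of Eq. (25): alpha * (A x^r)_i + k_i * alpha * e.
rhs = [alpha * sum(reduced[j] for j in nb) + len(nb) * alpha * e
       for nb in neighbors]
```

At convergence `reduced` and `rhs` agree to rounding error, and the node with the largest degree (node 0) has the largest reduced α-centrality.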




FIG. 8: Reduced αEV on undirected graphs. Variability of
reduced αEV distribution with e/α on the domain .gov of the
World Wide Web. The degree distribution has a tail which 
follows fairly well a power law with exponent 2.1. To better 
show the agreement we have shifted the curves such that the 
tails overlap. 




FIG. 9: The authority score of the node in the center is
proportional to the sum of the authority scores of the out- 
neighbors (blue squares) of the in-neighbors (red circles) of 
the node. 



C. HITS scores 

The meaning of the eigenvalue equations (7) and (8) is
quite simple. The hub score of a node is proportional to the sum of the
hub scores of the in-neighbors of the out-neighbors of the
node. The authority score of a node is proportional to the sum of the
authority scores of the out-neighbors of the in-neighbors
of the node (Fig. 9). Let us suppose that the nodes have
the same outdegree m. The authority score of a node
i is given by the sum of m k_in(i) terms, where k_in(i) is
the indegree of i. In fact, node i has k_in(i) in-neighbors,
each of them having m out-neighbors. If k_in(i) is large,
the number of summands is very large, and the sum can be
approximated by the average value of the authority score
over the whole graph, times m k_in(i). This approximation
becomes more accurate as m and k_in(i) grow. We conclude
that on a directed graph with constant outdegree the distribution
of the authority scores will have the same tail
as the indegree distribution. This is clearly illustrated in
Fig. 10. For the hub scores it is not possible to make predictions;
the sum that delivers the hub score of a node
cannot be approximated through other graph variables
in most cases.

FIG. 10: Distribution of the authority scores versus indegree
distribution. (Left) DMS graph with 10^ nodes, m = 10 and
a = 1. (Right) DMS graph with 10^ nodes, m = 50 and a = 1.

The extension of the HITS scores to the case of undirected
graphs is not interesting. In this case A^t = A,
so A^tA = AA^t = A^2 and the hub and authority scores
are identical. Moreover, they coincide with EV, as the
matrices A and A^2 have the same eigenvectors.



IV. RANKINGS 

In the previous sections we have investigated the distri- 
butions of spectral centrality measures and their similar- 
ities. As we have mentioned in the Introduction, central- 
ity measures are used to rank nodes. In this section we 
shall compare the rankings obtained with different cen- 
trality measures. In order to compare two rankings we 
adopt Kendall's τ [24], a widely used index in this type
of analysis. Kendall's τ ranges from 1 (perfect correlation)
to -1 (perfect anticorrelation). In Table I we show
the cross-comparisons between all centrality measures we
discuss in this work, for a DMS directed graph. For completeness
we have included the outdegree as well. As we
can see, PR, αEV and the authority scores are well correlated
with indegree and with each other, whereas the
other coefficients are small or negative; αEV has a strong
correlation with outdegree as well.
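For reference, a direct way to compute a rank correlation of this kind is sketched below. It implements the simplest variant of Kendall's τ (often called τ-a), in which tied pairs count as neither concordant nor discordant; which variant and which treatment of ties were used for the tables is not stated in the text, so this choice, like the function name `kendall_tau`, is an assumption. The double loop takes a time quadratic in the number of nodes and is meant for small examples only.

```python
def kendall_tau(x, y):
    """Kendall's tau-a between two equal-length lists of scores."""
    n = len(x)
    concordant = 0
    discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            s = (x[i] - x[j]) * (y[i] - y[j])
            if s > 0:
                concordant += 1        # the pair is ordered the same way
            elif s < 0:
                discordant += 1        # the pair is ordered the opposite way
    return (concordant - discordant) / (n * (n - 1) / 2)


same = kendall_tau([1, 2, 3, 4], [1, 2, 3, 4])       # 1.0
opposite = kendall_tau([1, 2, 3, 4], [4, 3, 2, 1])   # -1.0
```

To compare two centrality measures, x and y would hold the scores of the same nodes under the two measures, listed in the same node order.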

DMS graphs have a fairly regular structure; we have 
seen that in this case the behavior of centrality measures 
is quite regular, and that there are simple relations be- 
tween their distributions, which may be determined by 
simple relations between a measure and indegree at the 
level of the single node. Therefore, we cannot deduce 



Measures      τ
PR-αEV        0.8192
PR-AUTH       0.5774
PR-HUBS       0.1213
PR-IN         0.6444
PR-OUT       -0.3012
αEV-AUTH      0.5788
αEV-IN        0.6487
αEV-HUBS      0.1220
αEV-OUT       0.5788
AUTH-IN       0.5458
AUTH-HUBS     0.1076
AUTH-OUT     -0.2611
HUBS-IN       0.1142
HUBS-OUT     -0.2126
IN-OUT       -0.2507

TABLE I: Kendall's τ for each pair of centrality measures
computed for a DMS directed graph, with n = 10®, m = 3 
and a = 3. 



general conclusions from Table I and we repeated the
analysis for two real world networks: a network of political
blogs and the subset of the Web link graph corresponding
to the URLs of the domain .gov, that we have
studied in the previous sections.

The first network is a citation network consisting of
1490 blogs; 758 are democratic and 732 republican. It
was first studied by Adamic and Glance [25], who focused
on the community structure of the graph, which
matches that determined by the two political areas. The



Measures      τ
PR-αEV        0.09
PR-AUTH       0.14
PR-HUBS       0.04
PR-IN         0.14
PR-OUT        0.02
αEV-AUTH      0.12
αEV-IN        0.07
αEV-HUBS      0.08
αEV-OUT       0.01
AUTH-IN       0.12
AUTH-HUBS     0.07
AUTH-OUT      0.01
HUBS-IN       0.02
HUBS-OUT      0.07
IN-OUT        0.07

TABLE II: Kendall's τ for each pair of centrality measures
for the network of political blogs studied by Adamic and
Glance.

correlations now are rather weak. The small coefficients
indicate that the rankings differ considerably with the
measure chosen. To give an idea, in Table III we show
the Top Ten blogs in the rankings obtained with all centrality
measures. We see that there are clear differences
between the listings.

The results are basically the same for the Web graph.
Table IV reports the Kendall's τ between the rankings.
The values are of the same magnitude as for the network
of the blogs. The Top Ten listings for the Web are shown
in Table V and appear again considerably different from
each other.



V. CONCLUSIONS 

Centrality measures are very important to understand 
the properties of the nodes of complex networks and their 
topological roles. We have studied the most important 
centrality measures based on properties of graph matri- 
ces: PageRank, Eigenvector centrality, and the hub and 
authority scores of HITS. All these measures deduce the 
importance of a node in a self-consistent way from the im- 
portance of its nearest neighbors and, in the case of the 
HITS scores, of its next-to-nearest neighbors. We have
summarized some recent results on PageRank distributions
on particular types of tree-like graphs. On those graphs,
the distribution of PageRank in the limit q → 0 decays
as a power law with exponent 2. The same is true for α-centrality,
because its defining equation is formally equivalent
to the equation for PageRank in the limit q → 0.
These results on centrality distributions are likely to be 
true for an extended class of graphs, where there is a 
flow from the outermost nodes (leaves) to a sink. We 
have also seen that, on any graph, in the limit q → 1,
the reduced PageRank of a node, i.e. the contribution of 
the random walk process to the measure, is simply pro- 
portional to the indegree of the node, if the nodes have 
(about) the same outdegree. We have studied for the first 
time the extension of PageRank to the case of undirected 
networks, finding that the reduced PageRank of a node 
is proportional to its degree, for large degrees, for any 
graph and value of q. We proposed a simple explanation 
of this effect based on the Central Limit Theorem, and 
verified numerically in several cases that the argument 
holds. Similarly, the reduced α-centrality of a node is
also proportional to its degree, for large degrees, on any
graph. With the same type of argument it is possible to 
show that the authority score of a node is proportional 
to its indegree, for large indegrees, when the outdegrees 
of all nodes are (approximately) the same. 
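
The statement about the limit q → 1 can be checked numerically with a short sketch. We assume here the PageRank convention p(i) = q/n + (1 - q) Σ_{j→i} p(j)/k_out(j), with q the teleportation probability, and take the reduced PageRank to be p(i) - q/n. The five-node graph, in which every node has outdegree 2, and the function name pagerank are hypothetical choices made only for this illustration.

```python
def pagerank(out_links, q, iterations=200):
    """Power iteration for p(i) = q/n + (1 - q) * sum_{j -> i} p(j) / k_out(j).

    out_links[j] lists the targets of node j; every node is assumed to
    have at least one outgoing link (no dangling nodes).
    """
    n = len(out_links)
    p = [1.0 / n] * n
    for _ in range(iterations):
        new_p = [q / n] * n  # teleportation term
        for j, targets in enumerate(out_links):
            share = (1.0 - q) * p[j] / len(targets)
            for i in targets:
                new_p[i] += share  # random-walk term
        p = new_p
    return p


# Hypothetical directed graph: every node has outdegree 2.
out_links = [[1, 2], [2, 0], [0, 3], [0, 2], [0, 2]]
n = len(out_links)

indegree = [0] * n
for targets in out_links:
    for i in targets:
        indegree[i] += 1

q = 0.99  # close to the limit q -> 1
p = pagerank(out_links, q)
reduced = [p_i - q / n for p_i in p]

# Ratio of reduced PageRank to indegree, for nodes with nonzero indegree.
ratios = [r / k for r, k in zip(reduced, indegree) if k > 0]
print(indegree, ratios)
```

With equal outdegrees the ratios should be nearly identical, with relative differences of order 1 - q, while a node with no incoming links has zero reduced PageRank.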

We conclude that there are often strong relations between
our centrality measures and (in)degree: some relations
hold on particular graphs and/or limits, others
are more general. These findings imply that the mea- 
sures are often strongly correlated with each other. We 
have indeed seen that the rankings of nodes according 
to the centrality measures we have considered are quite 
close to each other for indegree, PageRank, Eigenvector 
centrality and authority score on graphs built with the 
prescription of Dorogovtsev, Mendes and Samukhin. We 
have shown in the paper that these graphs have special 
properties, and that some measures may be correlated to 
each other. Instead, on real graphs, like the networks of 
political blogs and the sample of the Web graph we have 
considered, the structure is less regular and the measures 
are far less correlated to each other, as confirmed by the 
small values of Kendall's τ for each pair of centrality
measures. This means that, for practical purposes, and
in spite of their similarities, spectral centrality measures
look at nodes from different perspectives and make it possible
to differentiate their roles within the network, thereby
providing more information about the importance of nodes.
The scores computed from spectral centrality measures
can complement the information about a node's centrality
derived from more traditional measures like node betweenness
[26]. This is especially important for directed
graphs, where node betweenness, as well as other measures
based on geodesic paths, like closeness [13], are not
well defined.

Rank  PR                          αEV                               Auth
1°    dailykos.com, D             atrios.blogspot.com, D            dailykos.com, D
2°    atrios.blogspot.com, D      dailykos.com, D                   talkingpointsmemo.com, D
3°    instapundit.com, R          talkingpointsmemo.com, D          atrios.blogspot.com, D
4°    blogsforbush.com, R         washingtonmonthly.com, D          washingtonmonthly.com, D
5°    talkingpointsmemo.com, D    talkleft.com, D                   talkleft.com, D
6°    michellemalkin.com, R       prospect.org/weblog, D            instapundit.com, R
7°    drudgereport.com, R         juancole.com, D                   juancole.com, D
8°    washingtonmonthly.com, D    digbysblog.blogspot.com, D        yglesias.typepad.com/matthew, D
9°    powerlineblog.com, R        pandagon.net, D                   pandagon.net, D
10°   andrewsullivan.com, R       yglesias.typepad.com/matthew, D   digbysblog.blogspot.com, D

Rank  Hubs                                       In                         Out
1°    politicalstrategy.org, D                   dailykos.com, D            blogsforbush.com, R
2°    madkane.com/notable.html, D                instapundit.com, R         newleftblogs.blogspot.com, D
3°    liberaloasis.com, D                        talkingpointsmemo.com, D   politicalstrategy.org, D
4°    stagefour.typepad.com/commonprejudice, D   atrios.blogspot.com, D     madkane.com/notable.html, D
5°    bodyandsoul.typepad.com, D                 drudgereport.com, R        cayankee.blogs.com, R
6°    corrente.blogspot.com, D                   powerlineblog.com, R       liberaloasis.com, D
7°    aurelientt.blogspot.com, D                 blogsforbush.com, R        lashawnbarber.com, D
8°    tbogg.blogspot.com, D                      washingtonmonthly.com, D   gevkaffeegal.typepad.com/thealliance, R
9°    newleftblogs.blogspot.com, D               michellemalkin.com, R      presidentboxer.blogspot.com, R
10°   atrios.blogspot.com, D                     truthlaidbear.com, R       corrente.blogspot.com, D

TABLE III: Top Ten of the network of political blogs according to PR, αEV, authorities, hubs, indegree and outdegree.
D: democratic; R: republican.



Measures       τ
PR-αEV         0.189
PR-AUTH        0.079
PR-HUBS        0.060
PR-IN          0.155
PR-OUT         0.090
αEV-AUTH       0.081
αEV-IN         0.147
αEV-HUBS       0.074
αEV-OUT        0.086
AUTH-IN        0.046
AUTH-HUBS      0.109
AUTH-OUT       0.072
HUBS-IN        0.003
HUBS-OUT       0.056
IN-OUT         0.081



TABLE IV: Kendall's τ for each pair of centrality measures
for the domain .gov of the Web.

Rank  PR                                          αEV
1°    www.usgs.gov                                polar.wwb.noaa.gov/waves/main_int.js
2°    www.nws.noaa.gov                            polar.wwb.noaa.gov/waves/welcome.html
3°    www.naca.larc.nasa.gov/readme.html          polar.wwb.noaa.gov/waves/main_table.html
4°    www.usda.gov                                polar.wwb.noaa.gov/waves/products.html
5°    www.nws.noaa.gov/disclaimer.html            polar.wwb.noaa.gov/waves/main_int.html
6°    www.ar.inel.gov/home.htm                    www.nws.noaa.gov/disclaimer1.html
7°    www.4woman.gov/search/search.cfm            www.nws.noaa.gov
8°    www.nws.noaa.gov/feedback.shtml             polar.wwb.noaa.gov/waves/references.htm
9°    www.access.wa.gov                           polar.wwb.noaa.gov/waves/validation.htm
10°   www.usinfo.state.gov/products/pdq/pdq.htm   polar.wwb.noaa.gov/waves/valid_wna.html

Rank  Auth                                                       In
1°    www.srh.noaa.gov/oun/cgi-bin/wxclick.pl?county=oklahoma    www.usgs.gov
2°    www.srh.noaa.gov/oun/cgi-bin/wxclick.pl?county=cleveland   www.cdc.gov
3°    www.srh.noaa.gov/oun/cgi-bin/wxclick.pl?county=kiowa       www.usda.gov
4°    www.nws.noaa.gov                                           www.doi.gov
5°    www.srh.noaa.gov/oun/cgi-bin/wxclick.pl?county=logan       www.nws.noaa.gov
6°    www.srh.noaa.gov/oun/cgi-bin/wxclick.pl?county=payne       www.usgs.gov/disclaimer.html
7°    www.srh.noaa.gov/oun/cgi-bin/wxclick.pl?county=knox        www.usda.gov/news/privacy.htm
8°    weather.noaa.gov                                           www.abag.ca.gov
9°    weather.noaa.gov/weather/ok_cc_us.html                     www.ars.usda.gov/nodisc.html
10°   www.crh.noaa.gov/ddc                                       www.ars.usda.gov/comm.htm

TABLE V: Top ten of the web domain .gov according to PR, αEV, authorities and indegree.



